    bnstruct: an R package for Bayesian Network structure learning in the presence of missing data.

    Motivation: A Bayesian Network is a probabilistic graphical model that encodes probabilistic dependencies between a set of random variables. We introduce bnstruct, an open source R package to (i) learn the structure and the parameters of a Bayesian Network from data in the presence of missing values and (ii) perform reasoning and inference on the learned Bayesian Networks. To the best of our knowledge, there is no other open source software that provides methods for all of these tasks, particularly the handling of missing data, which is a common situation in practice. Availability and Implementation: The software is implemented in R and C and is available on CRAN under a GPL licence. Supplementary information: Supplementary data are available at Bioinformatics online.
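
    bnstruct itself is an R package; as a rough, language-agnostic illustration of the impute-then-search idea described above (not the package's actual algorithm), the Python sketch below fills missing values with a naive per-column mode imputation and then runs greedy BIC-scored hill climbing over discrete data. All variable names and the toy data are made up.

    import itertools
    import math
    import pandas as pd

    def family_bic(df, child, parents):
        # BIC contribution of one node given its parent set (discrete data)
        n, r = len(df), df[child].nunique()
        groups = [g for _, g in df.groupby(list(parents))] if parents else [df]
        loglik = sum(c * math.log(c / len(g))
                     for g in groups for c in g[child].value_counts())
        return loglik - 0.5 * math.log(n) * len(groups) * (r - 1)

    def greedy_structure(df, max_parents=2):
        # forward greedy search: keep adding the parent edges that improve the BIC score
        nodes = list(df.columns)
        parents = {v: set() for v in nodes}

        def creates_cycle(u, v):
            # adding u -> v closes a cycle if u is already reachable from v
            stack, seen = [v], set()
            while stack:
                x = stack.pop()
                if x == u:
                    return True
                seen.add(x)
                stack.extend(w for w in nodes if x in parents[w] and w not in seen)
            return False

        improved = True
        while improved:
            improved = False
            for u, v in itertools.permutations(nodes, 2):
                if u in parents[v] or len(parents[v]) >= max_parents or creates_cycle(u, v):
                    continue
                if family_bic(df, v, parents[v] | {u}) > family_bic(df, v, parents[v]):
                    parents[v].add(u)
                    improved = True
        return parents

    # toy discretized data with missing values, filled by per-column mode imputation
    df = pd.DataFrame({"A": [0, 1, 1, None, 0, 1, 0, 1],
                       "B": [0, 1, 1, 1, None, 1, 0, 1],
                       "C": [1, 0, 0, 0, 1, 0, 1, None]})
    df = df.apply(lambda col: col.fillna(col.mode().iloc[0]))
    print(greedy_structure(df))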

    A rule-based model of insulin signalling pathway

    BACKGROUND: The insulin signalling pathway (ISP) is an important biochemical pathway, which regulates some fundamental biological functions such as glucose and lipid metabolism, protein synthesis, cell proliferation, cell differentiation and apoptosis. In recent years, different mathematical models based on ordinary differential equations have been proposed in the literature to describe specific features of the ISP, thus providing a description of the behaviour of the system and its emerging properties. However, protein-protein interactions potentially generate a multiplicity of distinct chemical species, an issue referred to as “combinatorial complexity”, which results in a number of state variables equal to the number of possible protein modifications. This often leads to complex, error-prone and difficult-to-handle model definitions. RESULTS: In this work, we present a comprehensive model of the ISP, which integrates three models previously available in the literature by using the rule-based modelling (RBM) approach. RBM allows for a simple description of a number of signalling pathway characteristics, such as the phosphorylation of signalling proteins at multiple sites with different effects, the simultaneous interaction of many molecules of the signalling pathways with several binding partners, and the information about subcellular localization where reactions take place. Thanks to its modularity, it also allows an easy integration of different pathways. After the RBM specification, we simulated the dynamic behaviour of the ISP model and validated it using experimental data. We then examined the predicted profiles of all the active species and clustered them into four groups according to their dynamic behaviour. Finally, we used parametric sensitivity analysis to show the role of negative feedback loops in controlling the robustness of the system. CONCLUSIONS: The presented ISP model is a powerful tool for data simulation and can be used in combination with experimental approaches to guide experimental design. The model is available at http://sysbiobig.dei.unipd.it/ and was submitted to the Biomodels Database (https://www.ebi.ac.uk/biomodels-main/# MODEL 1604100005). ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12918-016-0281-4) contains supplementary material, which is available to authorized users.
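
    As a toy illustration of why rule-based modelling tames combinatorial complexity (a generic Python sketch with a made-up protein, not the paper's ISP model or its rule language): each rule states only the context it needs, and pattern matching expands it over every matching species, so a handful of rules covers a species space that grows exponentially with the number of modification sites.

    from itertools import product

    # hypothetical protein with 3 independent phosphorylation sites
    SITES = ('s1', 's2', 's3')
    species = [dict(zip(SITES, states)) for states in product('up', repeat=len(SITES))]

    def expand_rule(site, rate):
        # generate concrete reactions from one rule: R(site='u') -> R(site='p')
        reactions = []
        for sp in species:
            if sp[site] == 'u':                 # the only context the rule cares about
                reactions.append((sp, {**sp, site: 'p'}, rate))
        return reactions

    rules = [(s, 0.1) for s in SITES]           # one phosphorylation rule per site
    reactions = [rx for site, k in rules for rx in expand_rule(site, k)]
    print(f"{len(rules)} rules expand to {len(reactions)} reactions over {len(species)} species")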

    Improving biomarker list stability by integration of biological knowledge in the learning process

    BACKGROUND: The identification of robust lists of molecular biomarkers related to a disease is a fundamental step for early diagnosis and treatment. However, methodologies for biomarker discovery using microarray data often provide results with limited overlap. It has been suggested that one reason for these inconsistencies may be that in complex diseases, such as cancer, multiple genes belonging to one or more physiological pathways are associated with the outcomes. Thus, a possible approach to improve list stability is to integrate biological information from genomic databases in the learning process; however, a comprehensive assessment based on different types of biological information is still lacking in the literature. In this work we have compared the effect of using different types of biological information in the learning process, such as functional annotations, protein-protein interactions and expression correlation among genes. RESULTS: Biological knowledge has been encoded by means of gene similarity matrices, and the expression data have been linearly transformed in such a way that the more similar two features are, the more closely they are mapped. Two semantic similarity matrices, based on Biological Process and Molecular Function Gene Ontology annotation, and the geodesic distance applied on protein-protein interaction networks are the best performers in improving list stability while maintaining almost equal prediction accuracy. CONCLUSIONS: The performed analysis supports the idea that when some features are strongly correlated with each other, for example because they are close in the protein-protein interaction network, they might have similar importance and be equally relevant for the task at hand. The obtained results can be a starting point for additional experiments on combining similarity matrices in order to obtain even more stable lists of biomarkers. The implementation of the classification algorithm is available at http://www.math.unipd.it/~dasan/biomarkers.html
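
    One plausible reading of the similarity-driven linear transformation described above, sketched in Python with numpy (this is a hedged illustration, not necessarily the paper's exact mapping): multiply the expression matrix by a square root of a gene-gene similarity matrix, so that genes that are similar according to the matrix end up contributing jointly in the transformed space used for feature selection and classification. The toy similarity matrix here is random and only stands in for a GO semantic-similarity or PPI-geodesic kernel.

    import numpy as np

    rng = np.random.default_rng(0)
    n_samples, n_genes = 40, 200
    X = rng.standard_normal((n_samples, n_genes))        # expression data (samples x genes)

    # toy symmetric, positive semidefinite gene similarity matrix
    A = rng.random((n_genes, n_genes))
    S = A @ A.T / n_genes

    # symmetric square root of S via eigendecomposition
    w, V = np.linalg.eigh(S)
    S_half = V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.T

    X_transformed = X @ S_half                           # similar genes are now mapped close together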

    A Boolean Approach to Linear Prediction for Signaling Network Modeling

    The task of the DREAM4 (Dialogue for Reverse Engineering Assessments and Methods) “Predictive signaling network modeling” challenge was to develop a method that, from single-stimulus/inhibitor data, reconstructs a cause-effect network to be used to predict the protein activity level in multi-stimulus/inhibitor experimental conditions. The method presented in this paper, one of the best performing in this challenge, consists of three steps: 1. Boolean tables are inferred from single-stimulus/inhibitor data to classify whether a particular combination of stimulus and inhibitor is affecting the protein. 2. A cause-effect network is reconstructed starting from these tables. 3. Training data are linearly combined according to rules inferred from the reconstructed network. This method, although simple, achieves good performance, providing reasonable predictions based on a reconstructed network that is compatible with knowledge from the literature. It can potentially be used to predict how signaling pathways are affected by different ligands and how this response is altered by diseases.
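
    A minimal Python sketch of step 1 only, under assumptions that are ours rather than the authors': the stimulus and inhibitor names, the baseline, and the log2 fold-change cutoff are placeholders, used just to show how single-stimulus/inhibitor measurements can be turned into a Boolean "affected / not affected" table.

    import numpy as np

    stimuli = ['stimA', 'stimB', 'stimC']                # hypothetical ligands
    inhibitors = ['none', 'inh1', 'inh2']                # hypothetical inhibitors

    baseline = 10.0                                      # protein level with no stimulus
    rng = np.random.default_rng(1)
    measured = baseline * rng.uniform(0.5, 3.0, size=(len(stimuli), len(inhibitors)))

    fold_change = measured / baseline
    affected = np.abs(np.log2(fold_change)) > 1.0        # Boolean table: True = combination affects the protein

    for i, s in enumerate(stimuli):
        for j, inh in enumerate(inhibitors):
            print(f"{s} + {inh}: affected = {bool(affected[i, j])}")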

    Significance analysis of microarray transcript levels in time series experiments

    Background: Microarray time series studies are essential to understand the dynamics of molecular events. In order to limit the analysis to those genes that change expression over time, a first necessary step is to select differentially expressed transcripts. A variety of methods have been proposed for this purpose; however, these methods are seldom applicable in practice since they require a large number of replicates, which are often available only for a limited number of samples. In this data-poor context, we evaluate the performance of three selection methods, using synthetic data, over a range of experimental conditions. Application to real data is also discussed. Results: Three methods are considered to assess differentially expressed genes in data-poor conditions. Method 1 uses a threshold on individual samples based on a model of the experimental error. Method 2 calculates the area of the region bounded by the time series expression profiles, and considers the gene differentially expressed if the area exceeds a threshold based on a model of the experimental error. These two methods are compared to Method 3, recently proposed in the literature, which exploits spline fitting to compare time series profiles. Application of the three methods to synthetic data indicates that Method 2 outperforms the other two in both precision and recall when short time series are analyzed, while Method 3 outperforms the other two for long time series. Conclusion: These results help to address the choice of the algorithm to be used in data-poor time series expression studies, depending on the length of the time series.
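
    A hedged Python sketch of the area idea behind Method 2 (the reference profile, the error model and the threshold rule are simplified assumptions, not the paper's exact definitions): compute the area enclosed between two short profiles by the trapezoidal rule and call the gene differentially expressed when the area exceeds a threshold derived from the measurement-error standard deviation.

    import numpy as np

    time = np.array([0.0, 1.0, 2.0, 4.0, 8.0])               # hours
    control = np.array([0.1, 0.0, -0.1, 0.2, 0.1])           # log-ratios
    treated = np.array([0.2, 0.9, 1.4, 1.1, 0.6])

    diff = np.abs(treated - control)
    area = np.sum(0.5 * (diff[1:] + diff[:-1]) * np.diff(time))   # trapezoidal rule

    sigma = 0.25                                             # experimental-error SD (assumed)
    threshold = 2.0 * sigma * (time[-1] - time[0])           # toy error-based threshold

    print(f"area = {area:.2f}, threshold = {threshold:.2f}, "
          f"differentially expressed: {area > threshold}")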

    A quantization method based on threshold optimization for microarray short time series

    BACKGROUND: Reconstructing regulatory networks from gene expression profiles is a challenging problem of functional genomics. In microarray studies the number of samples is often very limited compared to the number of genes, thus the use of discrete data may help reduce the probability of finding random associations between genes. RESULTS: A quantization method is presented, based on a model of the experimental error and on a significance level that trades off false positive and false negative classifications; it can be used as a preliminary step in discrete reverse engineering methods. The method is tested on continuous synthetic data with two discrete reverse engineering methods: Reveal and Dynamic Bayesian Networks. CONCLUSION: The quantization method, evaluated in comparison with two standard methods (a 5% threshold based on the experimental error and rank sorting), improves the ability of Reveal and Dynamic Bayesian Networks to identify relations among genes.
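
    A hedged Python sketch of the quantization idea (the Gaussian error model and the way the significance level sets the threshold are our simplifying assumptions, not the paper's exact procedure): log-ratios whose magnitude exceeds an error-model quantile are mapped to +1/-1 and the rest to 0, before feeding a discrete reverse engineering method such as Reveal or a dynamic Bayesian network.

    import numpy as np
    from scipy.stats import norm

    sigma = 0.3            # SD of the measurement error (assumed known from replicates)
    alpha = 0.05           # significance level trading off false positives vs false negatives
    threshold = norm.ppf(1 - alpha / 2) * sigma

    log_ratios = np.array([0.05, -0.8, 0.4, 1.2, -0.1, -0.65])
    quantized = np.where(log_ratios > threshold, 1,
                         np.where(log_ratios < -threshold, -1, 0))
    print(threshold, quantized)        # e.g. [ 0 -1  0  1  0 -1]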

    Some like it hot: Thermal preference of the groundwater amphipod Niphargus longicaudatus (Costa, 1851) and climate change implications

    Groundwater is a crucial resource for humans and the environment, but its global human demand currently exceeds available volumes by 3.5 times. Climate change is expected to exacerbate this situation by increasing the frequency of droughts along with human impacts on groundwater ecosystems. Despite prior research on the quantitative effects of climate change on groundwater, the direct impacts on groundwater biodiversity, especially obligate groundwater species, remain largely unexplored. Therefore, investigating the potential impacts of climate change, including groundwater temperature changes, is crucial for the survival of obligate groundwater species. This study aimed to determine the thermal niche breadth of the crustacean amphipod species Niphargus longicaudatus by using the chronic method. We found that N. longicaudatus has a wide thermal niche with a natural performance range of 7–9 °C, which corresponds to the thermal regime this species experiences within its distribution range in Italy. The observed range of preferred temperature (PT) was different from the mean annual temperature of the sites from which the species has been collected, challenging the idea that groundwater species are only adapted to narrow temperature ranges. Considering the significant threats of climate change to groundwater ecosystems, these findings provide crucial information for the conservation of obligate groundwater species, suggesting that some of them may be more resilient to temperature changes than previously thought. Understanding the fundamental thermal niche of these species can inform conservation efforts and management strategies to protect groundwater ecosystems and their communities.

    An Optimized Data Structure for High Throughput 3D Proteomics Data: mzRTree

    As an emerging field, MS-based proteomics still requires software tools for efficiently storing and accessing experimental data. In this work, we focus on the management of LC-MS data, which are typically made available in standard XML-based portable formats. The structures that are currently employed to manage these data can be highly inefficient, especially when dealing with high-throughput profile data. LC-MS datasets are usually accessed through 2D range queries. Optimizing this type of operation could dramatically reduce the complexity of data analysis. We propose a novel data structure for LC-MS datasets, called mzRTree, which embodies a scalable index based on the R-tree data structure. mzRTree can be efficiently created from the XML-based data formats and is suitable for handling very large datasets. We experimentally show that, on all range queries, mzRTree outperforms other known structures used for LC-MS data, even on the queries those structures are optimized for. mzRTree is also more space efficient. As a result, mzRTree reduces the computational cost of data analysis for very large profile datasets. To be published in Journal of Proteomics; source code available at http://www.dei.unipd.it/mzrtre
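
    A toy Python stand-in for the idea behind mzRTree, not its implementation (mzRTree is R-tree based; here a simple uniform grid plays the role of the spatial index, and the peak values are invented): bucketing LC-MS peaks by (retention time, m/z) cells lets a 2D range query inspect only the cells overlapping the query rectangle instead of scanning the whole dataset.

    from collections import defaultdict

    class GridIndex:
        def __init__(self, rt_step=10.0, mz_step=50.0):
            self.rt_step, self.mz_step = rt_step, mz_step
            self.cells = defaultdict(list)

        def _cell(self, rt, mz):
            return (int(rt // self.rt_step), int(mz // self.mz_step))

        def insert(self, rt, mz, intensity):
            self.cells[self._cell(rt, mz)].append((rt, mz, intensity))

        def range_query(self, rt_min, rt_max, mz_min, mz_max):
            # visit only the grid cells overlapping the query rectangle
            c0, c1 = self._cell(rt_min, mz_min), self._cell(rt_max, mz_max)
            for i in range(c0[0], c1[0] + 1):
                for j in range(c0[1], c1[1] + 1):
                    for rt, mz, inten in self.cells.get((i, j), []):
                        if rt_min <= rt <= rt_max and mz_min <= mz <= mz_max:
                            yield rt, mz, inten

    idx = GridIndex()
    idx.insert(12.3, 501.2, 8.1e4)
    idx.insert(55.0, 799.9, 2.3e5)
    print(list(idx.range_query(0, 60, 400, 600)))    # returns only the first peak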

    Deep learning methods to predict amyotrophic lateral sclerosis disease progression

    Amyotrophic lateral sclerosis (ALS) is a highly complex and heterogeneous neurodegenerative disease that affects motor neurons. Since life expectancy is relatively low, it is essential to promptly understand the course of the disease to better target the patient’s treatment. Predictive models for disease progression are thus of great interest. One of the most extensive and well-studied open-access data resources for ALS is the Pooled Resource Open-Access ALS Clinical Trials (PRO-ACT) repository. In 2015, the DREAM-Phil Bowen ALS Prediction Prize4Life Challenge was held on PRO-ACT data, where competitors were asked to develop machine learning algorithms to predict disease progression, measured through the slope of the ALSFRS score between 3 and 12 months. However, although deep learning has already been successfully applied in several studies on ALS patients, to the best of our knowledge it remains unexplored for ALSFRS slope prediction in the PRO-ACT cohort. Here, we investigate how deep learning models perform in predicting ALS progression using the PRO-ACT data. We developed three models based on different architectures that showed comparable or better performance with respect to state-of-the-art models, thus representing a valid alternative for predicting ALS disease progression.
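
    A minimal sketch of the kind of model the abstract refers to, assuming a tf.keras setup and random placeholder features standing in for PRO-ACT-derived ones; this is a generic fully connected regressor for the ALSFRS slope, not one of the paper's three architectures.

    import numpy as np
    import tensorflow as tf

    rng = np.random.default_rng(0)
    n_patients, n_features = 512, 40
    X = rng.standard_normal((n_patients, n_features)).astype("float32")   # synthetic features
    y = rng.normal(-0.7, 0.5, n_patients).astype("float32")               # synthetic slope, points/month

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_features,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1),                                         # predicted ALSFRS slope
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    model.fit(X, y, validation_split=0.2, epochs=20, batch_size=32, verbose=0)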